A Novel Architecture for Data Mining Grid Scheduler

Authors

  • MEIQUN LIU
  • KUN GAO
  • ZHONG WAN
  • Qian Hu
Abstract

Task parallelization is an effective way to improve the performance of Data Mining applications. The Grid scheduler plays an important role in managing subtasks so as to achieve high performance. We introduce an additional component, which we call the serializer, whose purpose is to decompose a task into a series of independent subtasks according to its directed acyclic graph (DAG) and to send them to the scheduler queue as soon as they become executable with respect to the DAG dependencies. Experimental results show that the architecture performs well.

Key-Words: Scheduling Architecture, Knowledge Grid, Data Mining
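The serializer idea described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Serializer` class, the `task_finished` callback, the `run_all` driver, and the example task names are all hypothetical, and a plain in-memory `deque` stands in for the Grid scheduler queue. It assumes every task appears as a key in the dependency map.

```python
from collections import deque


class Serializer:
    """Hypothetical sketch: releases DAG tasks to a scheduler queue
    as soon as all of their dependencies have finished."""

    def __init__(self, dependencies, scheduler_queue):
        # dependencies maps each task to the set of tasks it waits for.
        # Assumption: every task appears as a key in this mapping.
        self.pending = {t: set(deps) for t, deps in dependencies.items()}
        # Reverse edges: for each task, the tasks that depend on it.
        self.dependents = {t: [] for t in dependencies}
        for task, deps in dependencies.items():
            for dep in deps:
                self.dependents[dep].append(task)
        self.queue = scheduler_queue
        self.released = set()
        # Tasks without dependencies are executable immediately.
        for task in dependencies:
            if not self.pending[task]:
                self._release(task)

    def _release(self, task):
        # Guard against sending the same task to the queue twice.
        if task not in self.released:
            self.released.add(task)
            self.queue.append(task)

    def task_finished(self, task):
        # Called when a task completes; unblocks its dependents.
        for child in self.dependents[task]:
            self.pending[child].discard(task)
            if not self.pending[child]:
                self._release(child)


def run_all(dependencies):
    """Drains the queue sequentially; appending to `order` stands in
    for executing the task on a Grid node."""
    queue = deque()
    serializer = Serializer(dependencies, queue)
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        serializer.task_finished(task)
    return order


# Hypothetical data mining workflow expressed as a DAG.
EXAMPLE_DAG = {
    "load": set(),
    "clean": {"load"},
    "cluster": {"clean"},
    "classify": {"clean"},
    "report": {"cluster", "classify"},
}
```

In this sketch, `cluster` and `classify` both enter the queue as soon as `clean` finishes, so a real scheduler could dispatch them to different nodes in parallel, while `report` is held back until both have completed.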

Similar resources

Workflow-based Tasks Scheduling on Grid

The distributed nature of data and the need for high performance make the Grid a suitable environment for distributed data mining (DDM). Since distributed data mining applications are typically data intensive, one of the main requirements of such a DDM Grid environment is efficient workflow scheduling. We propose an architecture for a Knowledge Grid scheduler that results in the minimal r...

Promoting performance and separation of concerns for data mining applications on the grid

Grid Computing brought the promise of making high-performance computing cheaper and more easily available than traditional supercomputing platforms. Such a promise was very well received by the data mining (DM) community, as DM applications typically process very large datasets and are thus very resource intensive. However, since the Grid is very dynamic and parallel data mining is prone to loa...

GridMiner: An Infrastructure for Data Mining on Computational Grids

Knowledge discovery in datasets integrated into Grids is a challenging research task. These large datasets are being collected and accumulated across a wide variety of fields, at a dramatic pace. They are often heterogeneous and geographically distributed and globally used by large user communities. There are major challenges involved in the efficient and reliable storage, fast processing, in...

SoPhIA: A Unified Architecture for Knowledge Discovery

This paper presents a novel architecture, Soph.I.A (Sophisticated Intelligent Architecture), which integrates Knowledge Management and Data Mining into a unified Knowledge Discovery Process. Within SophIA, Data Mining is driven by knowledge captured from domain experts. The Knowledge Grid is briefly reviewed to envision the implementation of the proposed framework.

Design and implementation of a data mining grid-aware architecture

Current business processes often use data from several sources. The data is typically heterogeneous and incomplete, and usually involves a huge number of records. This implies that the data must be transformed into a set of patterns, rules or some other kind of formalism that helps to understand the underlying information. The participation of several organizations in this process makes the assimilatio...


Publication date: 2008